Summary

The goal of our research is to develop practical models and efficient
algorithms to analyze the reliability/availability/maintainability of
complex systems in which component failures are statistically dependent
and each component is subject to degradations before complete failure.
The Cause-Based Multimode Model (CBMM) was developed. Practical and
computationally tractable solution methods were designed. In addition,
an ordered enumeration approach was developed to solve the network
survivability enhancement problem. Preliminary computational experiments
showed that this approach is very efficient.

Research Objectives


Our research program focuses on complex systems in which failures are
statistically dependent and each component is subject to degradations
before complete failure. The goal is to develop practical models and
efficient algorithms to analyze the reliability/availability/
maintainability and to define optimal design, management and maintenance
strategies.

Accomplishments and Progress

In the previous research period, the Event-Based Reliability Model (EBRM)
was developed for the reliability modeling and analysis of real systems
in which component failures are statistically dependent. Most existing
reliability models assume that system component failures are
statistically independent. This assumption, though it greatly simplifies
the problem, is often not valid, and the result is usually an
overestimation of network reliability. Some researchers have tried to
model dependent failures by conditional probabilities, with limited
success. The major problem is that an exponentially large number of
parameters have to be dealt with. The EBRM does not make use of
conditional probabilities, but tries to model explicitly the events that
cause component failures. Major advantages of the EBRM over the
traditional use of conditional probabilities include a reduction in the
number of parameters to be handled and a physically more meaningful set
of parameters. We have shown that the EBRM can be used to represent
exactly the same kind of statistical dependencies between component
failures as described by any given set of conditional probabilities.
This means that the EBRM is a completely general model which can be
applied to any kind of failure dependencies.
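
As an illustration only, the following Python sketch shows how explicit
failure-causing events can induce dependence between component failures
with only a few parameters. The event list, the probabilities, and the
function name `failure_distribution` are hypothetical, and the sketch
assumes that the events occur independently of one another; it is not
the EBRM as published.

```python
from itertools import product


def failure_distribution(events):
    """Return {frozenset of failed components: probability}.

    events: list of (probability, components_failed_by_event).
    Assumption of this sketch: events occur independently.
    """
    dist = {}
    for outcome in product([False, True], repeat=len(events)):
        prob = 1.0
        failed = set()
        for occurred, (p, affected) in zip(outcome, events):
            prob *= p if occurred else 1.0 - p
            if occurred:
                failed |= affected
        key = frozenset(failed)
        dist[key] = dist.get(key, 0.0) + prob
    return dist


# Hypothetical data: two private causes and one common cause.
events = [
    (0.01, {"A"}),        # event that fails only component A
    (0.01, {"B"}),        # event that fails only component B
    (0.005, {"A", "B"}),  # common-cause event that fails both
]
dist = failure_distribution(events)
p_a = sum(p for failed, p in dist.items() if "A" in failed)
p_b = sum(p for failed, p in dist.items() if "B" in failed)
p_ab = sum(p for failed, p in dist.items() if {"A", "B"} <= failed)
```

Here three event probabilities determine the whole joint failure
distribution, and `p_ab` exceeds `p_a * p_b`, so the two failures are
positively dependent even though no conditional probability was
specified directly.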

We have also developed a model to approximate the reliability of systems
with multimode components. Previous research on reliability has been
focused on models which assume that each component may be in one of two
modes, namely, operative or failed. In real life, a component may undergo
degradations in performance before a complete outage, and will therefore
operate in more than two modes. Since it has been proved that the exact
calculation of system reliability (even for two-mode models) is an
NP-complete problem, we have developed an approximation method to
calculate this reliability measure. This method requires us to work with
the states of the system in order of decreasing probability. An
algorithm, ORDER-M, has been developed to generate these states in the
proper order.

In this research period, we have developed the Cause-Based Multimode
Model (CBMM), which allows one to consider failure dependencies of
components which are subject to degradations. The model is very flexible
and general and has physically meaningful parameters. Practical methods
to approximate and bound network reliability and performance measures
have been developed, based on a state enumeration approach. The methods
are very efficient when components are reliable, which is the case in
most real systems, because only a small fraction of the total number of
states need be considered to achieve a very good approximation.
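
The idea of bounding reliability from the most probable states can be
sketched as follows. This is not ORDER-M or the CBMM method itself: for
simplicity the sketch assumes statistically independent components, and
the function names, the `works` predicate, and the `coverage` parameter
are hypothetical. States are produced by a best-first search, and the
probability mass not yet enumerated gives the gap between the bounds.

```python
import heapq


def states_by_decreasing_probability(mode_probs):
    """Yield (probability, state) in non-increasing probability order.

    mode_probs[i][m] is the probability that component i is in mode m.
    Assumption of this sketch: components are independent.
    """
    # For each component, mode indices sorted from most to least probable.
    order = [sorted(range(len(p)), key=lambda k: -p[k]) for p in mode_probs]

    def prob(ranks):
        result = 1.0
        for i, r in enumerate(ranks):
            result *= mode_probs[i][order[i][r]]
        return result

    start = (0,) * len(mode_probs)
    heap = [(-prob(start), start)]
    seen = {start}
    while heap:
        neg_p, ranks = heapq.heappop(heap)
        yield -neg_p, tuple(order[i][r] for i, r in enumerate(ranks))
        # Moving one component to its next most probable mode never
        # increases the state probability, so best-first order is valid.
        for i, r in enumerate(ranks):
            if r + 1 < len(order[i]):
                child = ranks[:i] + (r + 1,) + ranks[i + 1:]
                if child not in seen:
                    seen.add(child)
                    heapq.heappush(heap, (-prob(child), child))


def reliability_bounds(mode_probs, works, coverage=0.999):
    """Return (lower, upper) bounds on P(system works)."""
    lower = covered = 0.0
    for p, state in states_by_decreasing_probability(mode_probs):
        covered += p
        if works(state):
            lower += p
        if covered >= coverage:
            break
    # Unexamined states could all be working states in the worst case.
    return lower, lower + (1.0 - covered)


# Hypothetical example: modes 0 = operative, 1 = degraded, 2 = failed.
probs = [[0.95, 0.04, 0.01]] * 3
low, high = reliability_bounds(probs, lambda s: 2 not in s, coverage=0.99)
```

When components are reliable, the first few states carry most of the
probability mass, so the two bounds close quickly.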

In the framework of the CBMM, we have developed a new algorithm to
enumerate the states of systems having dependent failures and degraded
components, which requires less computing time and memory space. We have
also developed a path-based approach to efficiently approximate the
reliability of systems having a path structure. A system has a path
structure if we can identify subsystems called paths such that the
system is working if and only if there is at least one working
subsystem. This assumption is usually applicable, since most real
systems exhibit a path structure. Tests on representative examples have
shown that the path-based approach can reduce the processing time by
orders of magnitude. The Cause-Based Multimode Model and the solution
methods are detailed in publications #2, #3, #4 and #5.
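
The path-structure definition can be made concrete with a small Python
sketch. It only illustrates the definition, not the path-based
approximation method of the publications; the function names and the
example data are hypothetical, a path is taken to work when all of its
components work, and components are assumed independent and two-mode.

```python
from itertools import product


def system_works(paths, working):
    """A path-structured system works iff at least one path works."""
    return any(path <= working for path in paths)


def exact_reliability(paths, p_work):
    """Brute-force reliability; only practical for very small systems.

    Assumption of this sketch: independent two-mode components.
    """
    components = sorted(p_work)
    total = 0.0
    for up in product([True, False], repeat=len(components)):
        prob = 1.0
        working = set()
        for name, is_up in zip(components, up):
            prob *= p_work[name] if is_up else 1.0 - p_work[name]
            if is_up:
                working.add(name)
        if system_works(paths, working):
            total += prob
    return total


# Hypothetical example: two disjoint two-component paths.
paths = [{"a", "b"}, {"c", "d"}]
p_work = {"a": 0.9, "b": 0.9, "c": 0.9, "d": 0.9}
reliability = exact_reliability(paths, p_work)
```

For this example the result agrees with the closed form
1 - (1 - 0.9^2)^2 = 0.9639.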

An important issue in the management of a communication network is the
enhancement of an existing network. We focus on survivability
enhancement. Network survivability is usually expressed in terms of edge
(node) connectivity, i.e., the minimum number of edges (nodes) that must
be removed in order to disrupt network operation. In real-world
communication networks, this measure is inappropriate, because
components may not have the same probability of failure, and failures
may be correlated. A more appropriate measure is the probability that
the network survives, i.e., keeps on operating successfully in the
presence of failures.

Therefore, we formulate the problem as follows: given an existing
network with failure-prone links, find a minimum-cost set of links to
add so that the probability of survival is at least a specified value.
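
In symbols, one way to write this formulation is given below. The
notation is an assumption introduced here for illustration: (V, E) is
the existing network, C the set of candidate links, c_e the cost of
adding link e, S(.) the probability of survival, and P_min the
specified value.

```latex
\min_{X \subseteq C} \; \sum_{e \in X} c_e
\quad \text{subject to} \quad
S\bigl(V,\, E \cup X\bigr) \;\ge\; P_{\min}
```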

Solving this problem requires checking that a candidate solution
satisfies the survivability constraint, which is itself NP-hard. Our
method is a combination of ordered enumeration and heuristics. The
ordered enumeration (OE) generates the possible enhancements in order of
increasing cost and, for each generated enhancement, tests whether the
resulting network satisfies the survivability constraint. The first
satisfactory enhancement encountered is the cost-optimal solution. If
the OE cannot find an optimal solution within a predetermined time, then
the heuristic is used to find a good solution within a reasonable
computation time. The heuristic first finds a feasible solution, i.e.,
an enhancement that satisfies the survivability constraint. The cost is
then optimized by performing a cost-ordered enumeration of all the
enhancements which are subsets of the feasible solution. The first
satisfactory enhancement found is the solution.
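
A minimal Python sketch of cost-ordered enumeration is shown below. It
is not the implementation of publication #1: the function names, the
`is_survivable` predicate (a stand-in for the survivability test), and
the `max_checks` budget (a stand-in for the predetermined time limit)
are hypothetical, and link costs are assumed to be non-negative.

```python
import heapq


def subsets_by_increasing_cost(costs):
    """Yield (total_cost, links) for every subset, cheapest first.

    costs: dict mapping a candidate link to its non-negative cost.
    """
    items = sorted(costs.items(), key=lambda kv: kv[1])
    yield 0, ()
    if not items:
        return
    # Heap entries: (total cost, increasing tuple of item indices).
    heap = [(items[0][1], (0,))]
    while heap:
        total, idx = heapq.heappop(heap)
        yield total, tuple(items[k][0] for k in idx)
        last = idx[-1]
        if last + 1 < len(items):
            nxt = items[last + 1][1]
            # Child 1: also add the next costlier link.
            heapq.heappush(heap, (total + nxt, idx + (last + 1,)))
            # Child 2: swap the last link for the next costlier one.
            heapq.heappush(
                heap, (total - items[last][1] + nxt, idx[:-1] + (last + 1,))
            )


def cheapest_feasible(costs, is_survivable, max_checks=None):
    """Return (links, cost); links is None if the budget ran out,
    in which case cost is a lower bound on the optimal cost."""
    checks = 0
    for total, links in subsets_by_increasing_cost(costs):
        if max_checks is not None and checks >= max_checks:
            return None, total
        checks += 1
        if is_survivable(links):
            return links, total
    return None, float("inf")


# Hypothetical example: adding x alone, or y and z together, suffices.
costs = {"x": 3, "y": 1, "z": 2}


def ok(links):
    return "x" in links or {"y", "z"} <= set(links)


best_links, best_cost = cheapest_feasible(costs, ok)
_, lower_bound = cheapest_feasible(costs, ok, max_checks=2)
```

Because subsets are examined in non-decreasing cost order, the first
feasible one is optimal. If the budget runs out first, every cheaper
subset has already been rejected, so the cost of the next unexamined
subset is a lower bound on the optimal cost. Restricting `costs` to the
links of a known feasible enhancement gives the kind of reduced
enumeration described for the heuristic.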

Since the number of enhancements grows exponentially with the number of
candidate links, and since the set of candidate links for the
cost-ordered enumeration of the heuristic is much smaller than the
candidate set for OE, the heuristic requires much less time than OE,
especially for larger networks.

Tests on randomly generated networks show that:

1. The heuristic finds a good solution within a reasonable computation
time for networks of practical size (about one hour on average for
a 30-node network).

2. The heuristic finds an optimal solution in a large majority of cases
(typically more than 80% of the time).

3. For small enhancements, the OE always finds the optimum within a
reasonable time, even for relatively large networks (a few hours on
average for a 40-node network).

One of the advantages of the method is that, unlike most heuristic
methods, one can assess the quality of the solution, since the OE always
gives a lower bound on the optimal cost. The method developed here can
also be used to enhance the survivability of transportation or flow
networks. Details can be found in publication #1.